Recent advances in speech fragm
Authors
Abstract
This paper addresses the problem of recognising speech in the presence of a competing speaker. We employ a speech fragment decoding technique that treats segregation and recognition as coupled problems. Data-driven techniques are used to segment a spectro-temporal representation into a set of spectro-temporal fragments, such that each fragment is dominated by one or other of the speech sources. A speech fragment decoder then uses missing data techniques and clean speech models to search simultaneously for the set of fragments and the word sequence that best match the target speaker model. The paper reports recent advances in this technique and presents an evaluation based on artificially mixed speech utterances. The fragment decoder produces significantly lower error rates than a conventional recogniser, and mimics the pattern of human performance whereby recognition improves as the target-masker ratio is reduced below -3 dB.
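The coupled search described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: it scores a single spectral frame against diagonal Gaussian "word" models instead of running a Viterbi search over HMMs, and it enumerates every fragment labelling exhaustively. The names `missing_data_loglik`, `decode_fragments` and `masker_range` are hypothetical, and the flat masker density used for channels assigned to the competing speaker is an assumption made so that target and masker terms are comparable.

```python
import math
from itertools import product


def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def log_cdf(x, mean, var):
    """Log probability that the clean target energy lies at or below x."""
    p = 0.5 * (1.0 + math.erf((x - mean) / math.sqrt(2.0 * var)))
    return math.log(max(p, 1e-300))  # floor avoids log(0)


def missing_data_loglik(obs, target_mask, mean, var, masker_range=20.0):
    """Score one frame against a clean speech model.

    Channels labelled as target are scored directly. Channels labelled as
    masker only bound the target energy from above (bounded marginalisation);
    the observed value is then explained by a flat masker density of width
    `masker_range` (an assumption of this sketch).
    """
    log_flat = -math.log(masker_range)
    total = 0.0
    for x, is_target, m, v in zip(obs, target_mask, mean, var):
        if is_target:
            total += log_gauss(x, m, v)
        else:
            total += log_cdf(x, m, v) + log_flat
    return total


def decode_fragments(obs, fragments, models):
    """Jointly pick the fragment labelling and the model with the best score.

    obs:       list of channel energies for one frame
    fragments: list of fragments, each a list of channel indices
    models:    dict mapping a label to (means, variances)
    Returns (score, label, labelling), where labelling[i] is True when
    fragment i is assigned to the target speaker.
    """
    best = None
    for labelling in product([False, True], repeat=len(fragments)):
        mask = [False] * len(obs)
        for frag, is_target in zip(fragments, labelling):
            for channel in frag:
                mask[channel] = is_target
        for label, (mean, var) in models.items():
            score = missing_data_loglik(obs, mask, mean, var)
            if best is None or score > best[0]:
                best = (score, label, labelling)
    return best


if __name__ == "__main__":
    # Hypothetical two-word vocabulary over four frequency channels.
    models = {
        "one": ([10.0, 10.0, 2.0, 2.0], [1.0, 1.0, 1.0, 1.0]),
        "two": ([2.0, 2.0, 10.0, 10.0], [1.0, 1.0, 1.0, 1.0]),
    }
    # Target "one" plus a masker that dominates channels 2 and 3.
    print(decode_fragments([10.0, 10.0, 8.0, 8.0], [[0, 1], [2, 3]], models))
```

Exhaustive enumeration grows as 2^F in the number of fragments F, so it is only practical for a toy case like this one; it is used here purely to make the joint choice of labelling and model explicit.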
Related resources
Recognizing the Emotional State Changes in Human Utterance by a Learning Statistical Method based on Gaussian Mixture Model
Speech is one of the richest and most immediate ways of expressing human emotion, and it conveys cognitive and semantic concepts between people. In this study, a statistical method for recognising emotion in speech signals is proposed, and a learning approach based on the statistical model is introduced to classify the internal feelings expressed in an utterance....
A reading list of recent advances in speech synthesis
This is a companion paper to my keynote talk at ICPhS 2015. It provides a guide to help readers familiarise themselves with recent advances in speech synthesis, with an emphasis on approaches that might provide useful tools to investigate speech, particularly by constructing experimental stimuli for perceptual experiments.